Query Hardness Estimation Using Jensen-Shannon Divergence Among Multiple Scoring Functions
Authors
Abstract
We consider the problem of query performance, and we propose a novel method for automatically predicting the difficulty of a query. Unlike a number of existing techniques, which are based on examining the ranked lists returned in response to perturbed versions of the query with respect to the given collection, or perturbed versions of the collection with respect to the given query, our technique is based on examining the ranked lists returned by multiple scoring functions (retrieval engines) with respect to the given query and collection. In essence, we propose that the results returned by multiple retrieval engines will be relatively similar for “easy” queries but more diverse for “difficult” queries. By appropriately employing Jensen-Shannon divergence to measure the “diversity” of the returned results, we demonstrate a methodology for predicting query difficulty whose performance exceeds that of existing state-of-the-art techniques on TREC collections, often remarkably so.
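As a concrete illustration of the idea, the Python sketch below turns each engine's ranked list into a probability distribution over documents and scores the query by the generalized Jensen-Shannon divergence of those distributions. The rank weighting 1/(r+1) and the helper names (ranked_list_to_dist, predicted_hardness) are illustrative assumptions, not the paper's exact construction.

```python
import math
from collections import defaultdict

def ranked_list_to_dist(ranking, universe):
    # Illustrative assumption: weight the document at rank r by 1/(r+1),
    # then normalize; documents this engine did not rank get probability 0.
    weights = defaultdict(float)
    for r, doc in enumerate(ranking):
        weights[doc] = 1.0 / (r + 1)
    total = sum(weights.values())
    return [weights[doc] / total for doc in universe]

def entropy(p):
    # Shannon entropy in bits.
    return -sum(x * math.log(x, 2) for x in p if x > 0)

def jensen_shannon(dists):
    # Generalized, equal-weight Jensen-Shannon divergence among m
    # distributions: H(average distribution) - average of the entropies.
    m = len(dists)
    mean = [sum(col) / m for col in zip(*dists)]
    return entropy(mean) - sum(entropy(p) for p in dists) / m

def predicted_hardness(rankings):
    # Higher divergence = more disagreement among engines = harder query.
    universe = sorted({doc for r in rankings for doc in r})
    return jensen_shannon([ranked_list_to_dist(r, universe) for r in rankings])

# Three hypothetical engines answering the same query:
print(predicted_hardness([["d1", "d2", "d3"],
                          ["d1", "d2", "d4"],
                          ["d5", "d1", "d6"]]))
```

Near-identical rankings drive this score toward zero, while the third engine's disagreement above pushes it up.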
Similar resources
Discrimination Measure of Correlations in a Population of Neurons by Using the Jensen-Shannon Divergence
The significance of synchronized spikes fired by nearby neurons for perception is still unclear. To evaluate how reliably one can decide whether a given population response to a sensory stimulus comes from the full joint distribution or from the product of independent distributions from each cell, we used recorded responses of pairs of single neurons in the primary visual cortex of macaque monkeys...
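A minimal sketch of the underlying test, assuming a toy pair of binary (spike / no-spike) cells: the Jensen-Shannon divergence between the joint response distribution and the product of its marginals is zero exactly when the cells are independent. The numbers are made up for illustration.

```python
import math

def jsd(p, q):
    # Jensen-Shannon divergence between two distributions (in bits).
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log(ai / bi, 2) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy joint distribution over (cell 1, cell 2) responses in {0,1}^2,
# flattened as [p00, p01, p10, p11]; made-up positively correlated values.
joint = [0.40, 0.10, 0.10, 0.40]
p1 = [joint[0] + joint[1], joint[2] + joint[3]]   # marginal of cell 1
p2 = [joint[0] + joint[2], joint[1] + joint[3]]   # marginal of cell 2
independent = [p1[i] * p2[j] for i in range(2) for j in range(2)]
print(jsd(joint, independent))  # positive iff the cells are correlated
```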
Active Learning for Probability Estimation Using Jensen-Shannon Divergence
Active selection of good training examples is an important approach to reducing data-collection costs in machine learning; however, most existing methods focus on maximizing classification accuracy. In many applications, such as those with unequal misclassification costs, producing good class probability estimates (CPEs) is more important than optimizing classification accuracy. We introduce no...
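The abstract suggests scoring unlabeled examples by how much a committee's class probability estimates (CPEs) diverge. Below is a hedged Python sketch of that idea, using the equal-weight generalized Jensen-Shannon divergence as the disagreement measure; the helper names and pool data are hypothetical, not the paper's implementation.

```python
import math

def entropy(p):
    return -sum(x * math.log(x, 2) for x in p if x > 0)

def committee_jsd(cpes):
    # Disagreement among committee members' class probability estimates:
    # generalized Jensen-Shannon divergence with equal member weights.
    m = len(cpes)
    mean = [sum(col) / m for col in zip(*cpes)]
    return entropy(mean) - sum(entropy(p) for p in cpes) / m

def select_query(pool_cpes):
    # Pick the unlabeled example whose committee CPEs disagree most.
    return max(range(len(pool_cpes)), key=lambda i: committee_jsd(pool_cpes[i]))

# Two committee members, three unlabeled examples, binary classes (made up):
pool = [
    [[0.9, 0.1], [0.8, 0.2]],   # mild agreement
    [[0.9, 0.1], [0.2, 0.8]],   # strong disagreement -> selected
    [[0.5, 0.5], [0.5, 0.5]],   # uncertain but unanimous
]
print(select_query(pool))  # prints 1
```

Note that the third example is maximally uncertain yet yields zero divergence: the utility rewards disagreement among members, not uncertainty within one member.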
A Note on Bound for Jensen-Shannon Divergence by Jeffreys
We present a lower bound on the Jensen-Shannon divergence in terms of the Jeffreys divergence when pi ≥ qi is satisfied. In the original Lin paper [IEEE Trans. Info. Theory, 37, 145 (1991)], where the divergence was introduced, the upper bound in terms of the Jeffreys divergence was a quarter of it. In view of a recent sharper bound reported by Crooks, we present a discussion of upper bounds by transcendental functions...
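For reference, these are the quantities involved, written with D(·‖·) for the Kullback-Leibler divergence and m for the midpoint distribution; the final inequality is the upper bound the note attributes to Lin (1991):

```latex
\[
  J(p,q) = \sum_i (p_i - q_i)\ln\frac{p_i}{q_i}
         = D(p\,\|\,q) + D(q\,\|\,p),
  \qquad
  \mathrm{JSD}(p,q) = \tfrac{1}{2}D(p\,\|\,m) + \tfrac{1}{2}D(q\,\|\,m),
  \quad
  m = \tfrac{1}{2}(p+q).
\]
% The upper bound from Lin's paper that the note discusses:
\[
  \mathrm{JSD}(p,q) \;\le\; \tfrac{1}{4}\,J(p,q).
\]
```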
Alpha-Divergence for Classification, Indexing and Retrieval (Revised 2)
Motivated by Chernoff's bound on the asymptotic probability of error, we propose the alpha-divergence measure and a surrogate, the alpha-Jensen difference, for feature classification, indexing and retrieval in image and other databases. The alpha-divergence, also known as the Rényi divergence, is a generalization of the Kullback-Leibler divergence and the Hellinger affinity between probability densities...
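A small Python sketch of the order-alpha Rényi divergence in its standard discrete form (an assumption; the paper's exact normalization may differ), showing the alpha = 1/2 case tied to the Hellinger affinity and alpha near 1 approaching the Kullback-Leibler divergence:

```python
import math

def renyi_divergence(p, q, alpha):
    # D_alpha(p||q) = (1/(alpha-1)) * log(sum_i p_i^alpha * q_i^(1-alpha)),
    # for alpha != 1. As alpha -> 1 it approaches the Kullback-Leibler
    # divergence; at alpha = 1/2 it equals -2*log of the Hellinger affinity
    # sum_i sqrt(p_i * q_i).
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1.0)

p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]
print(renyi_divergence(p, q, 0.5))   # -2 * log(Hellinger affinity)
print(renyi_divergence(p, q, 0.99))  # close to KL(p||q)
```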
متن کاملBounds on Non-Symmetric Divergence Measures in Terms of Symmetric Divergence Measures
Many information and divergence measures exist in the literature on information theory and statistics. The most famous among them are the Kullback-Leibler [13] relative information and the Jeffreys [12] J-divergence. The Sibson [17] Jensen-Shannon divergence has also found applications in the literature. The author [20] studied new divergence measures based on arithmetic and geometric means...
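To make the relationships among these measures concrete, here is a short Python sketch (illustrative, not from the paper) computing the two non-symmetric Kullback-Leibler directions, the symmetric Jeffreys J-divergence as their sum, and the Jensen-Shannon divergence:

```python
import math

def kl(p, q):
    # Kullback-Leibler relative information (non-symmetric).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jeffreys(p, q):
    # Jeffreys J-divergence: the symmetrized sum KL(p||q) + KL(q||p).
    return kl(p, q) + kl(q, p)

def jsd(p, q):
    # Jensen-Shannon divergence: symmetric, finite, and bounded.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p, q = [0.6, 0.3, 0.1], [0.2, 0.5, 0.3]
print(kl(p, q), kl(q, p))         # the two non-symmetric directions differ
print(jeffreys(p, q), jsd(p, q))  # symmetric measures; JSD <= J/4 holds here
```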
Journal:
Volume/Issue:
Pages:
Publication date: 2007